19942178.1 filed on 3 September 1999, the contents of which are hereby incorporated by
reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0002] The invention relates to a method for conditioning a database for automatic speech
processing, as well as a method for training a neural network for assigning graphemes to 
phonemes for automatic speech processing, and a method for assigning graphemes to 
phonemes in the synthesization of speech or in the recognition of speech. 

2. Description of the Related Art

[0003] It is known to use neural networks for synthesizing speech, the neural networks
converting a text, which is represented in a sequence of graphemes, into phonemes which are
converted into the corresponding acoustic sounds by an appropriate speech output device.
Graphemes are letters or combinations of letters which in each case are assigned a sound, the
phoneme. The neural network must be trained before being used for the first time. This is
normally performed by using a database which contains the grapheme/phoneme assignments, it
being established thereby which phoneme is assigned to which grapheme.

[0004] The setting up of such a database constitutes a substantial outlay on time and mental 
effort, since databases of this type can usually only be constructed with the aid of a language
expert. 

SUMMARY OF THE INVENTION

[0005] The object of the invention is to create a method with the aid of which it is possible in
a simple way to set up a database containing grapheme/phoneme assignments. [The object is
achieved by means of a method having the features of claim 1. Advantageous refinements of
the invention are specified in the subclaims.]

[0006] The method according to the invention for conditioning a database for automatic
speech processing proceeds from a database which contains words in the form of graphemes
and phonemes. Such databases already exist for most languages. These databases are
dictionaries which contain the words in script (graphemes) and in phonetic transcription
(phonemes). However, these databases lack the assignment of the individual phonemes to the
corresponding graphemes. This assignment is executed automatically according to the
invention by [means of] the following steps:

a) assigning the graphemes to the phonemes of all the words which have the same
number of graphemes and phonemes,

b) assigning the graphemes to the phonemes of all the words which have more
graphemes than phonemes, all the graphemes firstly being assigned to the phonemes in pairs
until an assignment error arises on the basis of the assignments determined hitherto or there
are present only at the end of the word one or more graphemes to which no phoneme is
assigned, and combining a plurality of graphemes to form a grapheme unit and assigning a
grapheme to the phoneme unit, and

c) assigning the graphemes to the phonemes of all the words which have fewer
graphemes than phonemes, a plurality of phonemes being combined to form a phoneme unit
and a single grapheme being assigned to them in such a way that the remaining
grapheme/phoneme assignments of the word to be analyzed correspond to the assignments
found under a) and b),

d) assigning the words hitherto not assignable, the words being examined in terms of
the phoneme units determined under c) and/or the grapheme units determined under b) and
the phonemes are assigned to the graphemes while taking account of the phoneme units and/or
grapheme units, and there being executed at least after step a) a correction step with the aid of
which assignments of words which contradict the further assignments determined in step a) are
erased.



[0007] According to the invention, the first step is to examine words which have the same
number of graphemes and phonemes. The graphemes of these words are assigned to the
phonemes in pairs, the assignments of the words which contradict the further assignments
being erased in a correction step following thereupon.

[0008] A large number of the words can be processed with the aid of this first assignment
operation and, in addition, statistically significant assignments can be achieved which permit
checking in the correction step and which also permit checking of the further assignments to be 
set up in the subsequent steps. 

[0009] Thereafter, those words are examined in the case of which the number of phonemes 
differs from the number of graphemes. In the case of words with more graphemes than 
phonemes, a plurality of graphemes are combined to form grapheme units, and phonemes are 
combined to form phoneme units in the case of words with fewer graphemes than phonemes. 

[0010] After termination of these steps, the words not hitherto assignable are examined,
account being taken in this case of the determined phoneme units and/or the determined
grapheme units.

[0011] Consequently, the method according to the invention is used to set up step by step an
"assignment knowledge" which is based initially on pairwise grapheme/phoneme assignments
and into which grapheme units and phoneme units are also incorporated in the course of the
method.

[0012] The method according to the invention can be applied to any desired language for 
which there already exists an electronically readable database which contains words in the form
of graphemes and phonemes, there being no need for an assignment between the phonemes 
and graphemes. The use of expert knowledge is not necessary, since the method according to 
the invention is executed fully automatically. 

[0013] It is then possible to use the database set up according to the invention to train a 
neural network with the aid of which the grapheme/phoneme assignments are executed 
automatically in synthesizing or recognizing speech. 

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] [The] These and other objects and advantages of the present invention [is explained
below in] will become more [detail with the aid of an exemplary embodiment that is illustrated in]
apparent and more readily appreciated from the following description of the preferred
embodiments, taken in conjunction with the accompanying drawings, in which:

Figure 1 is a flowchart of an exemplary embodiment of the method according to
the invention [in a flowchart],

Figure 2 [shows] is a [schematic] block diagram of a neural network for assigning
graphemes to phonemes, and

Figure 3 [shows] is a [schematic] block diagram of a device for carrying out the method
according to the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT



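[0016] The method according to the invention serves for conditioning a database for speech
synthesis, the starting point being an initial database that contains words in the form of
graphemes and phonemes. Such an initial database is a dictionary that contains the words in
script (graphemes) and in phonetic transcription (phonemes). However, such dictionaries do
not contain an assignment of the individual graphemes to the individual phonemes. The
purpose and aim of the method according to the invention is to set up such an assignment.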

[0017] An exemplary embodiment of the method according to the invention is illustrated in a
flowchart in figure 1. The method is started in a step S1.

[0018] Step S2 examines all words that have the same number of graphemes and
phonemes. The graphemes of these words are assigned to the corresponding phonemes in
pairs.

[0019] Such a pairwise assignment is executed, for example, for the English word "run",
which can be represented in the following way with the aid of its graphemes and phonemes:

Graphemes: r u n
Phonemes: r A n



[0020] In the case of "run", the grapheme "r" is assigned to the phoneme "r", the grapheme
"u" to the phoneme "A", and the grapheme "n" to the phoneme "n". In the case of this pairwise
assignment, each individual grapheme is therefore respectively assigned to a single phoneme.
This is executed for all words that have the same number of phonemes and graphemes.
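
By way of illustration only, the pairwise assignment of step S2 might be sketched in Python as follows. The function name `assign_pairwise` and the representation of a word as two lists are assumptions made for this sketch and do not appear in the original disclosure.

```python
def assign_pairwise(graphemes, phonemes):
    """Pair each grapheme with the phoneme at the same position (step S2).

    Returns None when the counts differ, because step S2 only handles
    words with equally many graphemes and phonemes.
    """
    if len(graphemes) != len(phonemes):
        return None
    return list(zip(graphemes, phonemes))


# The word "run" from the example above.
print(assign_pairwise(list("run"), ["r", "A", "n"]))
# [('r', 'r'), ('u', 'A'), ('n', 'n')]
```

Words for which the function returns None would be left for the later steps, which deal with differing numbers of graphemes and phonemes.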

[0021] In the subsequent step S3, a correction is executed which erases the assignments of
the words that contradict the further assignments determined in step S2. For this purpose, the
frequencies of the individual grapheme/phoneme assignments are detected, and
grapheme/phoneme assignments which only seldom occur are erased. If the frequency of a
specific grapheme/phoneme assignment is below a predetermined threshold value, the
corresponding grapheme/phoneme assignments are erased. The threshold value is, for
example, in the range of frequency from 10 to 100. The threshold value can be adjusted as
appropriate depending on the size of the vocabulary of the initial database, a higher threshold
value being expedient in the case of larger initial databases than in the case of smaller initial
databases.

[0022] An example of such a contradictory grapheme/phoneme assignment is the English 
word "fire": 



Graphemes: f i r e
Phonemes: f I @ r

[0023] The assignment of the grapheme "r" to the phoneme "@", and the assignment of the
grapheme "e" to the phoneme "r" are incorrect. These two assignments occur very seldom, for
which reason their frequency is lower than the threshold value, and so they are erased in step
S3. In addition, the word "fire" is marked again in step S3 as non-assigned, so that it can be
re-examined in a later assignment step.
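
The correction of step S3 might be sketched as follows. This is a minimal illustration under stated assumptions: the function name `correct_assignments`, the dictionary layout, the toy vocabulary and its phoneme symbols are all hypothetical, and a threshold of 2 is used only because the toy vocabulary is tiny (the text suggests values from 10 to 100 for real dictionaries).

```python
from collections import Counter


def correct_assignments(aligned, threshold):
    """Erase the assignments of words that contain a rare pair (sketch of step S3).

    `aligned` maps each word to its list of (grapheme, phoneme) pairs.
    Returns (kept, unassigned): the words whose pairs all reach the
    threshold, and the words marked as non-assigned for re-examination.
    """
    # Frequency of every individual grapheme/phoneme assignment.
    counts = Counter(pair for pairs in aligned.values() for pair in pairs)
    kept, unassigned = {}, []
    for word, pairs in aligned.items():
        if all(counts[pair] >= threshold for pair in pairs):
            kept[word] = pairs
        else:
            unassigned.append(word)
    return kept, unassigned


# Toy vocabulary (assumed data, illustrative phoneme symbols).
toy = {
    "run": [("r", "r"), ("u", "A"), ("n", "n")],
    "rum": [("r", "r"), ("u", "A"), ("m", "m")],
    "mum": [("m", "m"), ("u", "A"), ("m", "m")],
    "fun": [("f", "f"), ("u", "A"), ("n", "n")],
    "if": [("i", "I"), ("f", "f")],
    "in": [("i", "I"), ("n", "n")],
    "fire": [("f", "f"), ("i", "I"), ("r", "@"), ("e", "r")],
}
kept, unassigned = correct_assignments(toy, threshold=2)
print(unassigned)  # ['fire']
```

In this toy run the pairs ("r", "@") and ("e", "r") each occur only once, so "fire" is the only word whose assignments are erased.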

[0024] Words which have more graphemes than phonemes are examined in step S4, in each
case one grapheme being assigned to one phoneme in the reading direction (from left to right),
and the remaining graphemes being combined to form a grapheme unit with the last grapheme
that has been assigned to a phoneme. An example of a word that is correctly assigned in this
way is the English word "aback":

Graphemes: a b a ck
Phonemes: x b @ k

[0025] In step S5 following thereupon, a correction is executed in turn with the aid of which
assignments are erased that contradict the assignments determined hitherto, that is to say
assignments that have only a low frequency. Step S5 is therefore identical to step S3.
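
The assignment of step S4 described above might be sketched as follows. The function name `assign_with_trailing_unit` is an assumption for this sketch; the subsequent frequency check of step S5 is not included.

```python
def assign_with_trailing_unit(graphemes, phonemes):
    """Pair left to right and fold surplus graphemes into the last pair (step S4)."""
    if not phonemes or len(graphemes) <= len(phonemes):
        return None  # step S4 only handles words with more graphemes than phonemes
    n = len(phonemes)
    # One grapheme per phoneme for all phonemes except the last one.
    pairs = list(zip(graphemes[:n - 1], phonemes[:n - 1]))
    # The last assigned grapheme absorbs all remaining graphemes as a unit.
    pairs.append(("".join(graphemes[n - 1:]), phonemes[-1]))
    return pairs


print(assign_with_trailing_unit(list("aback"), ["x", "b", "@", "k"]))
# [('a', 'x'), ('b', 'b'), ('a', '@'), ('ck', 'k')]
```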

[0026] In step S6, the words that have more graphemes than phonemes and could not be
correctly assigned in step S4 are examined anew, an individual grapheme being assigned in
each case to an individual phoneme in the reading direction (from left to right). Each individual
assignment is checked as to whether it corresponds to the assignments determined hitherto. If
this checking reveals that a grapheme/phoneme assignment does not correspond to the
previous assignments, that is to say does not have the required frequency, the method reverts
to the last grapheme/phoneme assignment and joins the grapheme of this grapheme/phoneme
assignment to the next grapheme in the reading direction to form a grapheme unit. The
remaining phonemes and graphemes are then assigned to one another again individually, each
individual grapheme/phoneme assignment being checked, in turn.

[0027] One or more grapheme units can be generated inside a word during this method step,
the grapheme units comprising two graphemes as a rule. However, it is also possible that the
grapheme units can comprise three or more graphemes.

[0028] A word in which step S6 leads to a successful assignment is, for example, the English
word "abasement":

Graphemes: a b a se m e n t
Phonemes: x b e s m i n t

[0029] In the case of "abasement", the pairwise assignment proceeds correctly up to the
grapheme "e", which is firstly assigned to the phoneme "m". This assignment contradicts the
assignments determined hitherto, for which reason the method reverts to the last successful
assignment of the grapheme "s" to the phoneme "s", and joins the grapheme "s" with the
grapheme "e" to form the grapheme unit "se". The further pairwise assignment of the
graphemes to the phonemes corresponds again to the assignments determined hitherto, for
which reason they are executed correspondingly.
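
The reverting behaviour of step S6 might be sketched as follows. As a simplifying assumption, the assignments "determined hitherto" are represented by a plain set `known` of accepted (grapheme, phoneme) pairs rather than by frequencies compared against a threshold; the function name and the contents of `known` are hypothetical.

```python
def assign_with_backtracking(graphemes, phonemes, known):
    """Left-to-right assignment that joins graphemes on a contradiction (sketch of step S6)."""
    pairs = []
    j = 0  # index of the next phoneme to be assigned
    for g in graphemes:
        if j < len(phonemes) and (g, phonemes[j]) in known:
            pairs.append((g, phonemes[j]))
            j += 1
        elif pairs:
            # Contradiction: revert to the last assignment and extend
            # its grapheme with the current one to form a grapheme unit.
            last_g, last_p = pairs.pop()
            pairs.append((last_g + g, last_p))
        else:
            return None  # the very first grapheme cannot be assigned
    # Success only if every phoneme received a grapheme or grapheme unit.
    return pairs if j == len(phonemes) else None


known = {("a", "x"), ("b", "b"), ("a", "e"), ("s", "s"),
         ("m", "m"), ("e", "i"), ("n", "n"), ("t", "t")}
result = assign_with_backtracking(
    list("abasement"), ["x", "b", "e", "s", "m", "i", "n", "t"], known)
print(result)
# [('a', 'x'), ('b', 'b'), ('a', 'e'), ('se', 's'), ('m', 'm'), ('e', 'i'), ('n', 'n'), ('t', 't')]
```

A word for which the function returns None would correspond to a word that is marked and whose assignments are erased in the following step.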

[0030] The words that were examined in step S6 and have not been assigned with complete
success are marked in step S7, and their assignments are erased, in turn.

[0031] In step S8, the words that have more graphemes than phonemes and could not be
correctly assigned in steps S4 and S6 are examined anew, an individual grapheme being
assigned in each case to an individual phoneme firstly in the reading direction (from left to right).
Each individual assignment is checked, in turn, as to whether it corresponds to the assignments
determined hitherto. If this check shows that a grapheme/phoneme assignment does not
correspond to the previous assignments, that is to say that its frequency is below
the predetermined threshold value, individual graphemes are assigned to individual phonemes
counter to the reading direction (from right to left). If, in the case of this method, only one
phoneme is left over that cannot be assigned a grapheme, the remaining graphemes are
combined to form a grapheme unit and assigned to the one phoneme.

[0032] A grapheme unit can be generated inside a word in this method step.

[0033] A word in the case of which step S8 leads to a successful assignment is, for example,
the English word "amongst":

Graphemes: a m o ng s t
Phonemes: x m A G s t

[0034] In the case of "amongst", the pairwise assignment from left to right is performed
correctly up to the grapheme "n", which is firstly assigned to the phoneme "G". This assignment
contradicts the assignments determined hitherto, for which reason a pairwise assignment is
executed from right to left. This assignment proceeds correctly up to the grapheme "g", which is
firstly assigned to the phoneme "G". This assignment contradicts the assignments determined
hitherto. The phoneme "G" is left over as the only phoneme that cannot be assigned to a
grapheme. This phoneme "G" is now assigned to the remaining graphemes "n" and "g", which
are combined to form a grapheme unit.
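
The two-directional assignment of step S8 might be sketched as follows. As before, the accepted assignments are modelled as a set `known`, which is a simplification of the frequency check; the function name and the contents of `known` are assumptions.

```python
def assign_bidirectional(graphemes, phonemes, known):
    """Assign from the left, then from the right; one leftover phoneme
    receives the remaining graphemes as a unit (sketch of step S8)."""
    # Left to right until the first contradiction.
    i = 0
    left = []
    while (i < len(graphemes) and i < len(phonemes)
           and (graphemes[i], phonemes[i]) in known):
        left.append((graphemes[i], phonemes[i]))
        i += 1
    # Right to left until the first contradiction.
    gi, pi = len(graphemes) - 1, len(phonemes) - 1
    right = []
    while gi >= i and pi >= i and (graphemes[gi], phonemes[pi]) in known:
        right.append((graphemes[gi], phonemes[pi]))
        gi -= 1
        pi -= 1
    # Exactly one phoneme must be left over, with at least one grapheme.
    if pi != i or gi < i:
        return None
    unit = "".join(graphemes[i:gi + 1])
    return left + [(unit, phonemes[i])] + right[::-1]


known = {("a", "x"), ("m", "m"), ("o", "A"), ("s", "s"), ("t", "t")}
result = assign_bidirectional(
    list("amongst"), ["x", "m", "A", "G", "s", "t"], known)
print(result)
# [('a', 'x'), ('m', 'm'), ('o', 'A'), ('ng', 'G'), ('s', 's'), ('t', 't')]
```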

[0035] The words examined in step S8, which have not been assigned with complete
success, are marked in step S9 and their assignments are erased, in turn.

[0036] The words that have fewer graphemes than phonemes are examined in step S10, the
individual graphemes being assigned in pairs to the individual phonemes, the graphemes also
being assigned to the phonemes adjacent to the assigned phonemes. The respective
frequency of all these assignments is determined, and if it is established that a grapheme can
be assigned to the two adjacent phonemes with a high frequency, these two phonemes are
combined to form a phoneme unit if the two phonemes are two vowels or two consonants.

[0037] A word in which step [S9] S10 leads to a correct assignment is, for example, the
English word "axes":

Graphemes: a x e s
Phonemes: @ ks i z

[0038] In the case of "axes", the assignments of the grapheme "x" to the phonemes "k" and
"s" respectively yield a frequency that is above a predetermined threshold value, so that these
two phonemes are combined to form the phoneme unit "ks". The remaining graphemes and
phonemes are assigned in pairs, in turn.
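
The formation of a phoneme unit in step S10 might be sketched as follows. This sketch handles only the case of exactly one surplus phoneme; the function name, the frequency table `freq`, its numerical values, the threshold of 10 and the vowel set are all assumptions chosen for illustration.

```python
def form_phoneme_unit(graphemes, phonemes, freq, threshold, vowels):
    """Merge two adjacent phonemes that one grapheme covers (sketch of step S10)."""
    if len(phonemes) != len(graphemes) + 1:
        return None  # this sketch handles exactly one surplus phoneme
    for i, g in enumerate(graphemes):
        p1, p2 = phonemes[i], phonemes[i + 1]
        # Step S10 only merges two vowels or two consonants.
        same_class = (p1 in vowels) == (p2 in vowels)
        if (same_class
                and freq.get((g, p1), 0) >= threshold
                and freq.get((g, p2), 0) >= threshold):
            merged = phonemes[:i] + [p1 + p2] + phonemes[i + 2:]
            return list(zip(graphemes, merged))
    return None


# Assumed toy frequencies; real values would come from the database.
freq = {("a", "@"): 40, ("x", "k"): 12, ("x", "s"): 11,
        ("e", "i"): 30, ("s", "z"): 25}
result = form_phoneme_unit(
    list("axes"), ["@", "k", "s", "i", "z"], freq,
    threshold=10, vowels={"@", "i"})
print(result)
# [('a', '@'), ('x', 'ks'), ('e', 'i'), ('s', 'z')]
```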

[0039] It is also possible in step S10 that a plurality of phoneme units are formed, or that the 
phoneme units also comprise more than two phonemes. 

[0040] A correction is carried out in turn in step S11, in the case of which the assignments
that seldom occur are erased, and the words in which these contradictory assignments have
been established are marked as non-assigned. Step S11 corresponds essentially to steps S3
and S5, although in this case account is also taken of the grapheme/phoneme assignments
determined up to step S10.

[0041] Step S12 corresponds essentially to step S10, that is to say phoneme units are
formed from adjacent phonemes, the phoneme units not being limited in step S12 to two 
consonants or two vowels, but also being capable of containing a mixture of vowels and 
consonants. 

[0042] A correction operation that corresponds to step S11 is carried out in turn in step S13,
account being taken of all grapheme/phoneme assignments determined in the interim. 

[0043] The phoneme units determined in steps S10 and S12 are used in step S14 in order to 
re-examine words whose graphemes could not be correctly assigned to the phonemes, use 
being made, for adjacent phonemes, of a phoneme unit that exists for them already. It is also 
possible as an option to take account of the previously determined grapheme units. Should no 
use be made of this option, grapheme units can be formed here anew in accordance with the 
methods according to steps S4, S6 and S8.

[0044] A word that shows the assignment in accordance with step S14 is the English word
"accumulated":

Graphemes: a cc u m u l a t e d
Phonemes: x k yu m yx l e t i d

[0045] In the case of this word, the phonemes "y" and "u" or "y" and "x" are initially replaced
by the phoneme units "yu" and "yx", respectively. Since these phoneme units have already
been determined in the preceding steps, use is made in step S14 of the option that it is also
possible to take account of the grapheme units, and so the grapheme unit "cc" is used for the
two graphemes "c" and "c". The pairwise assignments of the individual graphemes or grapheme
units to the individual phonemes or phoneme units yield a correct assignment.

[0046] If no use is made of the option of taking account of the grapheme units then, as is the
case in step S6, the individual graphemes are assigned to the individual phonemes or phoneme
units, an assignment contradicting the previous assignments occurring in the present case with
the assignment of the grapheme "c" to the phoneme unit "yu". This contradictory assignment is
established, and the grapheme "c" is combined with the preceding grapheme "c" to form "cc".
This leads, in turn, to a correct assignment of the graphemes to the phonemes.
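
The replacement of adjacent phonemes by already determined phoneme units in step S14 might be sketched as follows. The function name `apply_phoneme_units` and the restriction to units of two phonemes are assumptions of this sketch, and the phoneme symbols for "accumulated" are illustrative.

```python
def apply_phoneme_units(phonemes, units):
    """Replace adjacent phonemes by an already known two-phoneme unit (sketch of step S14)."""
    out, i = [], 0
    while i < len(phonemes):
        if i + 1 < len(phonemes) and phonemes[i] + phonemes[i + 1] in units:
            out.append(phonemes[i] + phonemes[i + 1])
            i += 2  # both phonemes are consumed by the unit
        else:
            out.append(phonemes[i])
            i += 1
    return out


phonemes = ["x", "k", "y", "u", "m", "y", "x", "l", "e", "t", "i", "d"]
merged = apply_phoneme_units(phonemes, units={"yu", "yx"})
print(merged)
# ['x', 'k', 'yu', 'm', 'yx', 'l', 'e', 't', 'i', 'd']

# With the grapheme unit "cc" the two sequences have equal length
# and can be assigned in pairs.
graphemes = ["a", "cc", "u", "m", "u", "l", "a", "t", "e", "d"]
pairs = list(zip(graphemes, merged))
```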

[0047] A check is made, in turn, in step S15 as to whether contradictory assignments have 
arisen. If such contradictory assignments are established, they are erased together with the 
further assignments of the respective word. 

[0048] The method is terminated with the step S16.

[0049] The number of the contradictory assignments determined in step S15 is a feature of
the quality of the conditioning of the initial database, obtained by the method, with the individual 
grapheme/phoneme assignments. 

[0050] It was already possible for the method according to the invention to be used very
successfully in automatically setting up a database for the German language, an assignment
database with a total of 47 phonemes and 92 graphemes having been constructed. In setting
up the database for the English language, which has a substantially more complicated
grapheme/phoneme assignment, 62 phonemes and 222 graphemes resulted whose
assignments are not as good as in the case of the German language. The larger number of
graphemes in the English language complicates their processing. It can therefore be expedient
to introduce a zero phoneme, that is to say a phoneme without a sound. Such a zero phoneme
can be assigned, for example, to the English grapheme unit "gh", which occurs in the English
language in a voiceless fashion in combination with the graphemes "ei", "ou" and "au". If no
such zero phoneme were introduced, it would be necessary for the graphemes "eigh", "ough" and
"augh" to be introduced in addition to the graphemes "ei", "ou" and "au". The zero phoneme
permits a reduction in the number of the graphemes, since "eigh", "ough" and "augh" can be
replaced respectively by "ei", "ou" and "au" in combination with "gh". The reliability of the
method can be raised thereby. In particular, a smaller number of phonemes and/or graphemes
permits a simpler, faster and more reliable application in the case of a neural network that is
trained by [means of] the database set up with the aid of the method according to the invention.

[0051] Such a neural network, which has five input nodes and two output nodes, is illustrated
schematically in a simplified fashion in figure 2. Three consecutive letters B1, B2 and B3 of a
word that is to be converted into phonemes are input at three of the five input nodes. There are
two nodes on the output side, one of the two outputting the respective phoneme Ph, and the
other node outputting a grouping Gr. The grouping Gr last output and the phoneme Ph last
output are input at the two further input nodes.

[0052] This network is trained with the words of the database conditioned using the method
according to the invention, the grapheme/phoneme assignments of which database do not 
constitute a contradiction to the remaining grapheme/phoneme assignments, that is to say the 
words whose graphemes could be correctly assigned to the phonemes. 

[0053] The neural network determines a phoneme for the middle letter B2 in each case,
account being taken of the respectively preceding letter and subsequent letter in the context,
and of the phoneme Ph preceding the phoneme to be determined. If the two consecutive
letters B2 and B3 constitute a grapheme unit, the result is an output of two for the grouping Gr.
If the letter B2 is not a constituent of a grapheme unit consisting of a plurality of letters, a one is
output as grouping Gr.

[0054] Account is taken of the respectively last grouping Gr on the input side, no phoneme
Ph being assigned to the middle letter B2 in the case of a last grouping Gr of two, since this letter
has already been taken into account with the last grapheme unit. The second letter of the
grouping is skipped in this case.
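
How training examples for such a network could be derived from one conditioned word might be sketched as follows. Several details are assumptions not stated in the text: the function name `training_samples`, the padding symbol "#" at the word boundaries, the initial values used for the last phoneme and last grouping, and the plain tuples in place of whatever numerical encoding the network would actually use.

```python
def training_samples(pairs, pad="#"):
    """Build ((B1, B2, B3, last Ph, last Gr), (Ph, Gr)) samples from one
    aligned word, for a three-letter input window (illustrative sketch)."""
    letters = pad + "".join(g for g, _ in pairs) + pad
    samples = []
    pos = 1  # index of the middle letter B2 in the padded string
    last_ph, last_gr = pad, 1  # assumed start values
    for unit, ph in pairs:
        if len(unit) > 2:
            raise ValueError("a three-letter window covers units of at most two letters")
        inputs = (letters[pos - 1], letters[pos], letters[pos + 1], last_ph, last_gr)
        samples.append((inputs, (ph, len(unit))))
        last_ph, last_gr = ph, len(unit)
        pos += len(unit)  # a grouping of two skips the unit's second letter
    return samples


aligned = [("a", "x"), ("b", "b"), ("a", "@"), ("ck", "k")]
samples = training_samples(aligned)
for sample in samples:
    print(sample)
# (('#', 'a', 'b', '#', 1), ('x', 1))
# (('a', 'b', 'a', 'x', 1), ('b', 1))
# (('b', 'a', 'c', 'b', 1), ('@', 1))
# (('a', 'c', 'k', '@', 1), ('k', 2))
```

The last sample shows the grouping of two for the grapheme unit "ck"; no separate sample is produced for the letter "k", in line with the skipping described above.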






[0055] During training of the neural network, the values for the input nodes and for the output
nodes are, as is known per se, prescribed for the neural network, as a result of which the neural
network acquires the respective assignments in the context of the words.

[0056] It can be expedient to provide more than three letters at the input side of the neural 
network, in particular in the case of languages such as the English language in which a plurality
of letters are used to represent a single sound. For the German language it is expedient to 
provide three or five nodes at the input side for inputting letters, whereas for the English 
language five, seven or even nine nodes can be expedient for inputting letters. Grapheme units 
with up to five letters can be handled given nine nodes. 

[0057] Once the neural network has been trained with the database according to the 
invention, it can be used for generating language automatically. A device for generating 
language in which the neural network according to the invention can be used is shown 
schematically in figure 3. 

[0058] This device is an electronic data processing device 1 with an internal bus 2, to which a
central processor unit 3, a memory unit 4, an interface 5 and an acoustic output unit 6 are
connected. The interface 5 can make a connection to a further electronic data processing
device via a data line 8. A loudspeaker 7 is connected to the acoustic output unit 6.

[0059] The neural network according to the invention is stored in the memory unit 4 in the
form of a computer program that can be run by [means of] the central processor unit 3. A text
which is fed to the electronic data processing device in any desired way, for example, via the
interface 5, can then be fed with the aid of an appropriate auxiliary program to the neural
network that converts the graphemes or letters of the text into corresponding phonemes. These
phonemes are stored in a phoneme file that is forwarded via the internal bus 2 to the acoustic
output unit 6 with the aid of which the individual phonemes are converted into electric signals
that are converted into acoustic signals by the loudspeaker 7.

[0060] The method according to the invention for conditioning a database can also be
designed with the aid of such an electronic processing device 1, the method being stored,
again, in the form of a computer program in the memory 4, and being run by the central
processor unit 3, in which case it conditions an initial database that represents a dictionary in
script and phonetic transcription, into a database in which the individual sounds, the phonemes,
are assigned to the individual letters or letter combinations, the graphemes.

[0061] The assignment of the individual graphemes to the individual phonemes can be stored
in the conditioned database by blank characters that are inserted between the individual
phonemes and graphemes.
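
Storage by means of blank characters might be sketched as follows. The exact record layout of the conditioned database is not given in the text, so the two-string representation and the function names `format_entry` and `parse_entry` are assumptions.

```python
def format_entry(pairs):
    """Write an aligned word as two blank-separated strings."""
    graphemes = " ".join(g for g, _ in pairs)
    phonemes = " ".join(p for _, p in pairs)
    return graphemes, phonemes


def parse_entry(graphemes, phonemes):
    """Recover the (grapheme, phoneme) pairs from the two strings."""
    return list(zip(graphemes.split(" "), phonemes.split(" ")))


entry = format_entry([("a", "x"), ("b", "b"), ("a", "@"), ("ck", "k")])
print(entry)  # ('a b a ck', 'x b @ k')
```

Because the blanks mark the boundaries of grapheme units and phoneme units, the position of an item in one string identifies its partner in the other.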

[0062] The computer programs representing the method according to the invention and the 
neural network can also be stored on any desired electronically readable data media, and thus 
be transmitted to a further electronic data processing device.

[0063] The invention is described above with the aid of an exemplary embodiment with the 
aid of which a database for speech synthesis is generated. Of course, it is also possible within 
the scope of the invention to use the database generated according to the invention in speech 
recognition, since speech recognition methods frequently use databases with 
grapheme/phoneme assignments. 

[0064] Speech recognition can be executed, for example, with the aid of a neural network
that has been trained with the database set up according to the invention. At the input side, this
neural network preferably has three input nodes at which the phoneme to be converted into a
grapheme is input and, if it is present, at least one phoneme preceding in the word and one
subsequent phoneme are input. At the output side, the neural network has a node at which the
grapheme assigned to the phoneme is output.

[0065] Thus, the scope of the invention covers any application of the setting up and use of 
the database set up according to the invention in the field of automatic speech processing. 

[0066] The invention has been described in detail with particular reference to preferred
embodiments thereof and examples, but it will be understood that variations and modifications
can be effected within the spirit and scope of the invention.